This paper describes the implementation of a typical data-base
manager for an A.I. language like Planner, Conniver, or QA4, and
some proposed extensions for applications involving greater quantities
of data than usual. The extensions are concerned with data bases
involving several active and potentially active sub-data-bases, or
"contexts". The major mechanisms discussed are the use of contexts
as packets of data with free variables, and indexing data according
to the contexts they appear in. The paper also defends the Planner
approach to data representations against some more recent proposals.
I INTRODUCTION

One of the durable ideas in AI is the pattern-oriented data base
developed by Carl Hewitt for the original Planner programming language
<Hewitt, 1971>. It has appealed to many people because it seems to be a
good framework for implementing a "blackboard" communication channel
between processes <Newell, 1962>.

Such a data base operates by indexing records so that they may be
accessible with any of several different key patterns. This is the same as
the more traditional "file inversion" <Rivest, 1974>, except that the
patterns are more complex. The data base also provides primitives for
copying and combining collections of data. These facilities can be used to
implement controlled "deductive" procedures which limit their searches to
formulas that have been included in the current local data base, and which
are likely to match the current goals. When such data bases were
implemented (by various people in different ways, e.g., <Rulifson et al.,
1972>, <Sussman and McDermott, 1972>), they were found to help overcome the
obstacles facing previous deductive programs.

However, recently such data bases have come in for some criticism.
Some critics feel that sets of pattern-accessible records are wrong (or
misguided) as a data structure for AI applications. They usually recommend
some more strongly organized structure; they often appeal to the concept of
"frame" <Minsky, 1974>.

This criticism is buttressed by the feeling among many people that
Planner-type data bases are intrinsically hard to organize except for small
"toy" problems. Scott Fahlman <1975> has argued that the usual methods of
organizing them will break down completely when we try to implement data
bases of the size the human brain must contain.



Although it is possible, and done all the time, to misorganize a
Planner data base, I feel that the critics are being unfair. In fact, it
seems as though the frame idea flows naturally from the nature of
Planner-type data base mechanisms, including the "context" mechanism of QA4
and Conniver. In particular, the notions of hypothesis-driven recognition
triggered by suggestion demons <Fahlman, 1973>, of default values in frame
slots, explaining discrepancies, frame transition, etc., seem to be
descended from ideas like consequent vs. antecedent reasoning <Hewitt,
1971>, making assumptions based on partial evidence <McDermott, 1974a>, and
debugging slightly wrong data bases <Sussman, 1975>. As far as I can tell,
a frame is just a particular way to use a Conniver context.

Most frame proposals, however, do not suggest further embellishments of
an already sound framework, but instead tend to propose archaic data
structures with lists of slots and values, or a small set of interfaces and
possible messages. The only reason for this seems to be a widespread
acceptance of the second criticism, that a large Planner-type data base
could not be implemented efficiently. By "large" I mean large both
physically (requiring secondary storage with today's technology) and
organizationally (requiring care to avoid useless computation). It is the
purpose of this paper to show that the case is unproven. Besides having
this narrow technical intention, I intend to say a little about what kinds
of operations ought to be efficient in an AI data base, and which proposed
data structure might be the best.



II A TYPICAL DATA BASE MANAGER

In this section, I will describe the implementation of a typical
Planner-like data base manager (TDBM). The description is drawn from my
experience with several versions of Conniver <McDermott and Sussman, 1973>,
but is somewhat idealized, so as to distract you from merely haphazard
faults of existing systems and show you their deeper problems.


II.A Behavior

There are three important features of a typical data base management
system: "partial-match" record retrieval, multiple data bases, and derived
data.

II.A.1 "Partial-Match" Record Retrieval



Any record may be accessed by specifying any subset of its components,
with each specified component in its proper position. The unspecified, or
"don't care," parts may be indicated by question marks. For example, the
record (P (f a)) may be referred to by keys (P ?), (? (? a)), (P (? a)),
etc. Records in the data base, called items, are indexed when they are put
there ("added"), and unindexed when they are removed. The retrieval function
is FETCH, which returns all the present items which match its pattern
argument. For example, if (P (f a)), (P (f b)), and (P c) are present,



(FETCH '(P (f a))) returns ((P (f a)))

(FETCH '(P (f c))) returns ()

(FETCH '(P (f ?))) returns ((P (f a)), (P (f b)))

(FETCH '(P ?)) returns ((P (f a)), (P (f b)), (P c))

(A note on notation: as in MacLISP <Moon, 1974>, "'S-expression" is an
abbreviation for "(QUOTE S-expression)".)

Sometimes I mean by "match" that there is some mapping, from
occurrences of "?" in the fetch pattern to S-expressions, which makes the
fetch pattern equal to the item retrieved. In practical systems, the
matcher is more complicated. It may be thought of as a unification
algorithm <Robinson, 1965> in which labelled "don't cares" play the role
of variables. For example, (FETCH '(Q ?X ?X)) should find (Q a a) but not
(Q a b). I will more or less ignore this kind of matching in this paper.

However, there is an associated peculiarity I must mention. Our TDBM
must allow "don't cares" in the item records. This is because users will
want to be able to model propositions with items, and need to have free
(universally quantified) variables in items like ((IS MAN ?X) ⊃ (MORTAL
?X)). For our purposes, this may be treated as ((IS MAN ?) ⊃ (MORTAL ?)).
We want to be able to find it with a fetch pattern like (? ⊃ (MORTAL
FRED)).
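The partial-match behavior described above can be sketched in a few lines of Python. This is only an illustration, not the TDBM's matcher: it assumes S-expressions are modelled as nested tuples of strings, that the unlabelled "?" is the only kind of don't-care, and that dotted pairs are not needed. The names `matches` and `fetch`, and the atom IMPLIES standing in for the implication sign, are hypothetical.

```python
DONT_CARE = "?"

def matches(pattern, item):
    """True if some replacement of each "?" makes pattern and item equal.

    Don't-cares are honored on both sides, since items may contain them too.
    """
    if pattern == DONT_CARE or item == DONT_CARE:
        return True
    if isinstance(pattern, tuple) and isinstance(item, tuple):
        return len(pattern) == len(item) and all(
            matches(p, i) for p, i in zip(pattern, item))
    return pattern == item

def fetch(pattern, items):
    """Return every present item that matches the pattern (no index yet)."""
    return [item for item in items if matches(pattern, item)]

# The three items used in the FETCH examples.
ITEMS = [("P", ("f", "a")), ("P", ("f", "b")), ("P", "c")]
```

With these definitions, `fetch(("P", ("f", "?")), ITEMS)` picks out the two (P (f ...)) items, and a pattern such as `("?", "IMPLIES", ("MORTAL", "FRED"))` finds an item whose own slots are don't-cares.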

II.A.2 Multiple Data Bases

In AI applications, there are several reasons to have many different
data bases to select from at a given time. However, these cannot be
represented by anything as clumsy as separate files or indexes. It has to
be cheap to copy an entire data base in such a way that routine changes to
the copy (additions and removals of data) do not affect the original.

This is done by representing a data base as a list of layers which
represent the history of changes to an original empty set. With each layer
are stored the differences between the data bases that contain it and those
that do not. Such a list of layers is called a cxt (pronounced like
"context," Conniver's misleading term). I will reserve the term "data
base" from now on to mean the set of all data in all cxts.

For example, a problem solver may model time as a sequence of cxts,
each with one more layer. In CXT1, it may have the items (ON A B), (ON B
TABLE), (ON C TABLE), representing the scene of Fig. II.1(a).
Figure II.1

In contemplating an action, the problem solver may make a new layer (number
2), and "copy" CXT1 by pushing the new layer onto it, making CXT2. Now if
it REMOVEs (ON A B) and ADDs (ON A C) with respect to CXT2, all other data
are undisturbed, and CXT1's contents are unchanged. The two changes
"remove (ON A B)" and "add (ON A C)" are associated with layer 2, and thus
with CXT2 and all further cxts derived from it. (Another use is to model
"hypothetical" worlds in which some assumption is assumed temporarily. For
more detail on implementation, see <McDermott and Sussman, 1973>.) If the
layers of two cxts C1 and C2 are such that all the layers of C1 are in C2,
C1 is called a super-cxt of C2, which is a sub-cxt of C1. In my example,
CXT1 is a super-cxt of CXT2.

Layers are well behaved enough so that new cxts may be formed by taking
the union of sets of them. (This fact is often overlooked in discussions
of cxts, so that a "cxt tree" created by adding new layers is taken for
granted.) I will deal with this feature at length in Sect. IV.



II.A.3 Derived Data

The structure I have described is internally complete, but I must pile
one more superstructure upon it to fit it into the usual tradition.
Planner-type data bases almost always have built-in mechanisms for calling
simple "deductive" procedures. One kind, the IF-NEEDED method or
consequent theorem, is supposed to work with FETCH to generate "virtual
items" which might not be there until the FETCH happens. I will not say
much about this type, since the issues in implementing it are mostly
centered on how to control a generator co-routine.

The other kinds are implemented as "interrupts" of additions
and removals of items matching a pattern. These are Conniver's IF-ADDED
and IF-REMOVED method types. For example, in pidgin LISP for the TDBM, we
might write

(if-added (is rose ?r)
    (add !"(color @r red)))

(The !"-notation indicates "quasi-quotation": the value of !"S-expression
is S-expression with subparts marked by "@" replaced by their values.) A
procedure like this is present in a cxt just like the items it interacts
with. (ADD item) calls (FETCH !"(IF-ADDED @item ?)), and evaluates the
third slot in the returned items.

These interrupt methods tend to be used in two ways: to implement
useful forward deductions; and as genuine interrupts requiring non-trivial
attention by the calling program. The former is illustrated by our "roses
are red" example; the latter, by a method in a problem solver that notices
when a "protected" goal is being clobbered. (Cf. <Sussman, 1975>.) The
latter, the "true interrupts," pose no problem from a data-base management
point of view. I will ignore them.



The others, the "data derivers," fall into two classes: if-addeds that
add consequences of their trigger items; and if-removeds that clean them
up, like:

(if-removed (is rose ?r)
    (remove !"(color @r red)))

I will idealize this situation by assuming one kind of data deriver, an
item of the form

(ANTEC-ITEM input output).

("Antec" is descended from Hewitt's <1971> "antecedent theorem.") For
example, we can have (ANTEC-ITEM (IS ROSE ?R) (COLOR ?R RED)). When ADD
adds an item matching input, the matching substitution is applied to
output, and ADD adds it, too. Removal works asymmetrically. In this scheme,
ADD makes a note that the support for, say, (COLOR ROSE#71 RED) in this cxt
is the presence of (IS ROSE ROSE#71) and (ANTEC-ITEM (IS ROSE ?R) (COLOR ?R
RED)). This note is attached to all three items. When REMOVE flushes an
item, it flushes as well all items supported by it that have no other
support. This obviates clean-up methods.

The details of this somewhat arbitrary scheme are unimportant, but the
notion of distinguishing true interrupts from data derivers has been shown
to be useful in practice.
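The ANTEC-ITEM scheme can be illustrated with a small Python sketch. It makes several assumptions not in the text: a single cxt, items modelled as nested tuples, variables written as strings beginning with "?", and support notes kept in one table rather than attached to the items themselves. The names `match`, `substitute`, and `DataBase` are hypothetical.

```python
def match(pattern, item, bindings=None):
    """Return variable bindings that make pattern equal to item, or None."""
    bindings = dict(bindings or {})
    if isinstance(pattern, str) and pattern.startswith("?"):
        if bindings.get(pattern, item) != item:
            return None
        bindings[pattern] = item
        return bindings
    if isinstance(pattern, tuple) and isinstance(item, tuple):
        if len(pattern) != len(item):
            return None
        for sub_pattern, sub_item in zip(pattern, item):
            bindings = match(sub_pattern, sub_item, bindings)
            if bindings is None:
                return None
        return bindings
    return bindings if pattern == item else None

def substitute(expr, bindings):
    """Apply a matching substitution to an output pattern."""
    if isinstance(expr, tuple):
        return tuple(substitute(e, bindings) for e in expr)
    return bindings.get(expr, expr)

class DataBase:
    def __init__(self):
        self.items = set()
        self.supports = {}  # item -> list of frozensets of supporting items

    def add(self, item, support=()):
        # A user assertion gets the empty support set, which nothing removes.
        self.supports.setdefault(item, []).append(frozenset(support))
        if item in self.items:
            return
        self.items.add(item)
        rules = [i for i in self.items if i[0] == "ANTEC-ITEM"]
        pairs = [(rule, item) for rule in rules]
        if item[0] == "ANTEC-ITEM":
            pairs += [(item, trigger) for trigger in self.items]
        for rule, trigger in pairs:
            bindings = match(rule[1], trigger)
            if bindings is not None:
                self.add(substitute(rule[2], bindings), support=(rule, trigger))

    def remove(self, item):
        if item not in self.items:
            return
        self.items.discard(item)
        self.supports.pop(item, None)
        for derived in list(self.supports):
            if derived not in self.supports:
                continue  # already flushed by a recursive call
            remaining = [s for s in self.supports[derived] if item not in s]
            if len(remaining) != len(self.supports[derived]):
                self.supports[derived] = remaining
                if not remaining:
                    self.remove(derived)  # no other support: flush it too
```

Adding the rose rule and then (IS ROSE ROSE#71) derives the COLOR item; removing the IS item flushes the COLOR item again without any separate clean-up method.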

II.B Implementation

In the TDBM, there are three entities that stand between the caller of
FETCH and the data: the index, the cxt filter, and the matcher. All
previously-mentioned data are in the index (which behaves much like the
LISP atomic symbol array), but a given call to FETCH should return only a
handful. (See Fig. II.2.)
Figure II.2
The last step in Fig. II.2, which rejects (P a b) when the fetch pattern is
(P ?x ?x), is not of interest to us. We focus on the others.

Notice first that cxts are implemented backwards. The system does not
look in a cxt for a datum; it looks in a datum to see if it is in the
current cxt. In the index, an item is stored as an item datum, of the form
(item <cmarkers>). The cmarkers, one per relevant layer, say which cxts
the item has been added to or removed from <McDermott and Sussman, 1973>.
These markers are compared with the current cxt (the two lists being kept
sorted by layer number) to see if the datum is actually there. (Datum
"properties," such as what data support a datum, may also be stored in the
cmarkers.)

The marker system treats datum presence asymmetrically from absence. A
marker for a cxt layer is attached to a datum only if the datum was added
with respect to the layer. If it is later removed from the layer, the
marker is just thrown away. However, if it is removed in a cxt derived by
pushing new layers onto the original one, the system must cancel the
marker. For example, if (ON A B) is marked as being in layer 1, then
REMOVEing it in cxt (3 2 1) marks (ON A B) "present in 1 as cancelled in 3."
Any cxt which includes layer 1 will contain (ON A B) unless it includes
layer 3 also.
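The "backward" marker scheme can be sketched as follows in Python. The sketch assumes a cxt is a list of layer numbers with the newest layer first, and it uses sets instead of the sorted marker lists mentioned above; the names `ItemDatum`, `is_present`, `add`, and `remove` are hypothetical, not Conniver's.

```python
class ItemDatum:
    """An item together with its cmarkers."""
    def __init__(self, item):
        self.item = item
        # layer the item was added in -> set of layers that cancel that marker
        self.cmarkers = {}

def is_present(datum, cxt):
    """The datum is in cxt if some marker's layer is in cxt and uncancelled."""
    layers = set(cxt)
    return any(layer in layers and not (cancels & layers)
               for layer, cancels in datum.cmarkers.items())

def add(datum, cxt):
    """Add with respect to the newest layer of cxt (cxt[0])."""
    if not is_present(datum, cxt):
        datum.cmarkers.setdefault(cxt[0], set())

def remove(datum, cxt):
    """Remove with respect to the newest layer of cxt (cxt[0])."""
    top = cxt[0]
    for layer in list(datum.cmarkers):
        if layer == top:
            del datum.cmarkers[layer]       # removed where added: drop marker
        elif layer in cxt:
            datum.cmarkers[layer].add(top)  # added earlier: cancel it in top
```

For the (ON A B) example, adding in cxt `[1]` and removing in cxt `[3, 2, 1]` leaves the single marker "present in 1, cancelled in 3", so the item is still present in `[2, 1]` but absent from `[3, 2, 1]`.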

This "backward" cxt scheme is efficient enough for small data bases,
but for large ones it could be embarrassing. This is the topic of Section
IV.

Even for small data bases, it is important that the index fetcher
reject all but a small set of candidates. (Matching tends to be an
expensive way to reject an item.) There have been many ways suggested to
do this. The following is a composite.

Every item is indexed by its features. A feature is a pair <atom,
position>. For example, (P (f ?)) has features <P, CAR>, <f, CAADR>, <?,
CADADR>. (The positions are, of course, really represented as bit strings,
but we will use LISP CAR-CDR compositions, with "CR" standing for the empty
string that represents the top-level position.) For each feature, there is
a bucket of items with that feature. (This may be implemented using a hash
table.) For example, the <P, CAR> bucket will include items (P a), (P (f
b)), (P m n o p), if these are present in any cxt in the data base. The
<?, CADR> bucket will contain items like (P ?x), ((Q b) ?y), (P ?x (f c)),
etc.
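Feature extraction and bucket construction can be sketched in Python as below. The sketch assumes items are nested tuples, that any atom beginning with "?" counts as the don't-care "?", and that list terminators (NIL) are not indexed; `features` and `build_index` are hypothetical names, and a `defaultdict` stands in for the hash table.

```python
from collections import defaultdict

def features(item, path=""):
    """List the (atom, position) features of an item.

    `path` holds the letters between "C" and "R", so "" is the top-level
    position CR, "A" is CAR, and "AD" is CADR.
    """
    if not isinstance(item, tuple):
        atom = "?" if isinstance(item, str) and item.startswith("?") else item
        return [(atom, "C" + path + "R")]
    result = []
    for element in item:
        result.extend(features(element, "A" + path))  # CAR of current tail
        path = "D" + path                             # move on to the CDR
    return result

def build_index(items):
    """Map each feature to the bucket (list) of items that have it."""
    index = defaultdict(list)
    for item in items:
        for feature in features(item):
            index[feature].append(item)
    return index
```

Calling `features(("P", ("f", "?")))` yields the three features listed in the text for (P (f ?)).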

The function INDEX-FETCH takes a pattern and a position, and returns a
list of candidates that contain, in that position, a structure which could
match pattern. Thus (INDEX-FETCH 'pattern 'CR) returns all items anywhere
which could match pattern. If the pattern is "?," all items must be
included; this is indicated by returning a symbol for the "universal
bucket." Otherwise, the returned list must always include the members of
the <?, position> bucket. If the pattern is an atom, the union of the <?,
position> and <atom, position> buckets is returned. (Nothing else could
match.)

When the pattern is non-atomic, INDEX-FETCH must examine the left and
right "extensions" of the current position; that is, the CAR and CDR
relative to the current position. For example, (INDEX-FETCH '(P ?) 'CR)
will call (INDEX-FETCH 'P 'CAR) and (INDEX-FETCH '(?) 'CDR). The pattern
of calls may be represented by a tree isomorphic to the original fetch
pattern:
Figure II.3
Remember that at each "care" node the "don't care" bucket must be unioned
in. But first, what does the system do with the results of the two
subcalls to INDEX-FETCH at each level? Conceptually, the desired result is
their intersection, the set of all items which share the feature
combinations on each subtree. (This is actually done in the current
Conniver, under some circumstances.) However, it is too painful in a large
data base with many items. The output of the intersection may be quite
small, and it will have to be redone every time there is a FETCH.

A cheaper procedure, most of the time, is just to take the shorter of
the two buckets at each stage. This leaves in some losers, but it may not
be worth it to get rid of them. For example, given the fetch pattern (P
A), there may be many items starting with P, but only a few with an A in
the CADR position. For now, let us assume it is cheaper to filter out the
losers with the cxt and matching mechanisms than do intersections with
arbitrarily long buckets.

The mechanism so far thus selects a branch of a tree like Fig. II.3(a),
and returns the union of the buckets on the branch. (There is no reason
actually to do the union until it returns to the top level.) Notice that,
since the tree is searched depth-first, the search down a branch can be cut
off as soon as the sum of the sizes of the buckets encountered exceeds that
of the best previous branch.

In a very large data base, taking the shorter of the two competing
bucket unions may not be enough. If there are two competing features in a
fetch pattern, and each has a large bucket, but the intersection of the two
is small, it is costly to return either one or their (expensive to compute)
intersection. The usual solution is to take one of the offending buckets
and break it down by features again, to produce sub-buckets corresponding
to pairs of features. This amounts to computing all non-empty
intersections in advance. For example, if the indexed items are ((P a), (P
b), (P c), (P d), (P e), (Q a), (R a), (S a), (T a)), then the <P, CAR>
bucket is ((P a), (P b), (P c), (P d), (P e)), the <a, CADR> bucket is ((P
a), (Q a), (R a), (S a), (T a)), and the <NIL, CDDR> bucket includes all
the data. The best (INDEX-FETCH '(P a) 'CR) can do is find the two 5-item
buckets. Assuming 5 exceeds the threshold of bad taste for bucket size,
the indexer decides to break down the <P, CAR> bucket.
When it is done, the index contains the following:



Buckets
<P, CAR>
              Sub-buckets
              <P, CAR>     ((P a), (P b), (P c), (P d), (P e))
              <a, CADR>    ((P a))
              <b, CADR>    ((P b))
              <c, CADR>    ((P c))
              <d, CADR>    ((P d))
              <e, CADR>    ((P e))
              <NIL, CDDR>  ((P a), (P b), (P c), (P d), (P e))
<Q, CAR>      ((Q a))
<R, CAR>      ((R a))
<S, CAR>      ((S a))
<T, CAR>      ((T a))
<a, CADR>     ((P a), (Q a), (R a), (S a), (T a))
<b, CADR>     ((P b))
<c, CADR>     ((P c))
<d, CADR>     ((P d))
<e, CADR>     ((P e))
<NIL, CDDR>   ((P a), (P b), (P c), (P d), (P e), (Q a),
               (R a), (S a), (T a))

Figure II.4

This procedure is called rehashing the <P, CAR> bucket. From now on, any
reference to the <P, CAR> bucket is handled by finding the relevant
sub-buckets given the rest of the pattern, and using those instead (and
ignoring the corresponding buckets in the main index, which are always
bigger). Consequently, bucket and sub-bucket keys must be generalized to
sequences of features, like <<P, CAR>, <a, CADR>>. In this way, the index
tends to organize itself into efficient sub-indexes. If necessary, the
process may continue, generating sub-sub-indexes, etc. (with the sub-index
coming more and more to resemble QA4's discrimination net index system
<Rulifson et al., 1972>).
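Rehashing a single oversized bucket can be sketched in Python as below. The sketch assumes items are nested tuples, omits NIL features, and leaves out the threshold test and the bookkeeping that redirects later fetches to the sub-index; `features` and `rehash` are hypothetical names.

```python
from collections import defaultdict

def features(item, path=""):
    """List (atom, position) features, e.g. ("a", "CADR") for (P a)."""
    if not isinstance(item, tuple):
        return [(item, "C" + path + "R")]
    out = []
    for element in item:
        out.extend(features(element, "A" + path))
        path = "D" + path
    return out

def rehash(bucket):
    """Break one bucket down by features again, giving its sub-buckets."""
    sub_index = defaultdict(list)
    for item in bucket:
        for feature in features(item):
            sub_index[feature].append(item)
    return dict(sub_index)

ITEMS = [("P", "a"), ("P", "b"), ("P", "c"), ("P", "d"), ("P", "e"),
         ("Q", "a"), ("R", "a"), ("S", "a"), ("T", "a")]
P_BUCKET = [item for item in ITEMS if ("P", "CAR") in features(item)]
P_SUB_INDEX = rehash(P_BUCKET)
```

After rehashing, a fetch of (P a) can consult the one-item sub-bucket keyed by the feature pair <P, CAR>, <a, CADR>, here `P_SUB_INDEX[("a", "CADR")]`, instead of either five-item bucket.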

One drawback to my scheme is that some buckets created this way will
never be used. For example, in the sample index shown, the only reason to
have a bucket for the feature <a, CADR> in the main index as well as in
various sub-indexes is the possibility of a future (FETCH '(? a)). If the
user is sure all future fetches will mention P, Q, R, etc.
explicitly, the bucket is useless to him.



The solution is to make the matching, indexing, and unindexing
routines data-driven in the sense that their default actions may be
overridden by routines dependent on the data being handled. The simplest
scheme is to associate special indexing routines with the CAR of an item.
For example, the routines for atoms like P might specify skipping indexing
and fetching on the CADR position at all, in the case where all FETCHes
have a "?" in the CADR position.

Data-driven indexing is helpful in other ways. A form like QA4's (SET
element1 ... elementN) may be indexed in such a way that all the elements
are associated with the same (CADR) position. Then a special set matcher
can be used to process FETCHed sets. If the set (SET A B) is fished for
with the pattern (SET B . ?), it will be overlooked unless something like
this is done. A more prosaic example is avoiding looking for <?, pos>
buckets when they are known not to exist. Ordinary Planner and Conniver
programs usually have great quantities of dull data like (VERTEX 0.5 -4.7),
where it is wasteful to have to look for the empty <?, CADR> and <?, CADDR>
buckets.



III THE SYMBOL-MAPPING PROBLEM
III.A Introduction

Difficulties of scale with the typical data-base manager appear in
connection with Fahlman's <1975> "symbol-mapping" problem:

Suppose I tell you that a certain animal -- let's call him Clyde -- is
an elephant. You accept this simple assertion and file it away with no
apparent display of mental effort. And yet, as a result of this simple
transaction, you suddenly appear to know a great deal about Clyde. If I
say that Clyde climbs trees or plays the piano or lives in a teacup,
you will immediately begin to doubt my credibility. Somehow
"elephant" is serving as more than a mere label here; it is, in some
sense, a whole package of properties and relationships, and that
package can be delivered by means of a single IS-A statement.

In principle, such behavior can be achieved through the use of some
form of deduction. Each fact is a separate entity, and new facts are
produced by knocking together two old ones. Thus, if we have "All
elephants have wrinkles" and "Clyde is an elephant", we have the right
to deduce that Clyde has wrinkles. In one form or another, this has
been the standard AI approach.

But having the right to deduce some fact is not the same as having
the job done. Much ingenuity has been devoted to the search for fast
deductive mechanisms, but the problem remains intractably
combinatorial.

There are two problems described here: bringing in large amounts of
typical-elephant knowledge, with, loosely speaking, typical-elephant "bound
to" Clyde; and detecting implausibilities ("Clyde climbs trees") when they
come up.

The second is a serious problem, whose solution depends on the
particular domain it is drawn from. We should not expect a raw data-base
system to perform unassisted tasks like choosing between two
interpretations of "he" in "The monkey screamed at the elephant as he
climbed the tree."

But I feel that the first problem should be made simple by such a
system. Let us examine the three options for solving it that a
Planner-type system offers us:

(1) We could treat

(is elephant ?x)
    ⊃ (color ?x gray) ∧ (size ?x big) ∧ (diet ?x plants)
      ∧ (is mammal ?x) ∧ ...

as a large ANTEC-ITEM (Sect. II.A.3). When (IS ELEPHANT CLYDE) is added to
a cxt, the newly-formed facts (COLOR CLYDE GRAY), (SIZE CLYDE BIG), ... are
added as well. Unfortunately, this is likely to waste a lot of time and
space, since the number of facts known about elephants is large, but only a
few of them are going to be used on any occasion. Further, facts like (IS
MAMMAL CLYDE) are likely to cause many more facts to be deduced.



(2) We could wait until a fetch of (COLOR CLYDE ?C) was attempted, and
propose (IS ELEPHANT CLYDE) as a subgoal (via an IF-NEEDED method or
consequent theorem). Obviously, however, there are many equally
good-looking subgoals, like (IS ROSE CLYDE) or (OCCUPATION CLYDE
CHIMNEY-SWEEP), for the system to waste its time going through.

Another approach like this is to search through all things that Clyde
IS when his color is wanted, looking for an assertion like (COLOR ELEPHANT
GRAY). For this search to mean anything, the IS relation will have to
carry the burden of representing most information in the system, indeed any
piece of information from which inferences could have been made (in this
case, about COLOR). For example, one would be forced to say (IS
CHIMNEY-SWEEP CLYDE) instead of (OCCUPATION CLYDE CHIMNEY-SWEEP), or the
item (COLOR CHIMNEY-SWEEP BLACK) would not be found. Although the approach
can be useful when the use of IS is carefully tailored for a specific
application, if used profligately it turns "(IS" into a mere syntactic
symbol, a left bracket like "(", which indicates only that some inference
might have been drawn from what it delimits. The method then becomes
"search through (almost) everything known about Clyde to see if a COLOR
turns up," which could not be efficient on a normal computer. (Of course,
it is an open question whether all useful AI inference can be made to fit
the "(IS ...)" syntax. Cf. <Woods, 1975>.) Another difference between "(IS"
and "(" would be that IS presumably would signal that its second argument
is what the assertion is "about." For example, if by accident we had (IS
(OCCUPATION CLYDE) CHIMNEY-SWEEP), this fact would not be "about" Clyde,
and would not be considered.

(3) If the right-hand side of the implication could be represented as a
cxt layer or set of layers, we could just merge these layers into the
current cxt. The problem with this is that "CLYDE" does not appear on the
right-hand side. We really want a "closure" of a cxt, analogous to
closures in languages like POP-2 <Burstall et al., 1971> and LISP 1.5
<Levin et al., 1965>, in which a free variable like ?X is bound to CLYDE
with minimal cost.

Approach (3) is the one we will explore, but it is worth examining in
detail why a Conniver-style cxt will not work. The user can't replace ?X
with something like THE-ELEPHANT, for two reasons: there might be more than
one elephant; and he has to begin referring to Clyde as THE-ELEPHANT. This
second reason is a problem because some of the facts involved might refer
to THE-MAMMAL, and because two processes in the user's problem solving
program might have trouble communicating with each other, each having
focussed on a different aspect of this creature and chosen a different
name.

These facts hurt, not only because they blunt our attack on the
symbol-mapping problem, but because they appear to pull the rug out from
under efforts to implement a Minskian frame system <Minsky, 1974> in a
Planner-like way.

However, there is a way to debug approach (3), which makes a
Planner-type data base a viable candidate for a substrate for frames.

III.B Potential Items

The idea is to index an item like (COLOR ?X GRAY) with ?X as a
variable, but mark it as a mere potential item. When FETCH finds such a
potential item, it is to "smash" it with values of ?X corresponding to
creatures whose elephant-hood has been asserted. In addition, the item is
to be present only in cxts including the "elephant layer," which is the
representation of the right-hand side of our large implication. This layer
is included only as long as there are elephants around. Such a layer (or,
more generally, entire cxt), containing potential items referring to
elephants, is called a packet, after Fahlman <1973> and Marcus <1974>.

In other words, I am trying to make the system "cheat" by substituting
values for variables in the right-hand side of a large implication before
the substitution to do it with is known. In lieu of that substitution, a
set of temporary bindings to dummy quantities like (#ABSTRACT ELEPHANT) is
used. Since the indexer is data-driven, it can be instructed to index
(#ABSTRACT ...) as a variable. I will indicate this binding to a
quasi-constant with the prefix "?##".

As an example, let us examine the elephant packet. It contains the
potential items (COLOR ?##ELEPHANT GRAY)*, (SIZE ?##ELEPHANT LARGE)*, etc.
(I will mark potential items with an "*".) The system gives the packet an
atomic name, like ELEPHANT-PKT, and treats it as a predicate with the free
variables as arguments, (ELEPHANT-PKT ?X). This allows it to process items
like (ANTEC-ITEM (IS ELEPHANT ?X) (ELEPHANT-PKT ?X)). When (IS ELEPHANT
CLYDE) is added to a cxt, it triggers the adding of (ELEPHANT-PKT CLYDE).
The system indexes this item as usual, but notices that ELEPHANT-PKT is not
an ordinary predicate, but a packet, so it includes the ELEPHANT-PKT cxt
layers in the current cxt, and notes the substitution of CLYDE for ?X. The
item (ELEPHANT-PKT CLYDE) is a packet-closure.
Now when a FETCH of any of

(color clyde gray)
(color fred gray)
(color ?z gray)
(color ?x ?c)
(color clyde ?c), etc.

is attempted, the potential item (COLOR ?##ELEPHANT GRAY)* will be found
and actualized to (COLOR CLYDE GRAY). This new item will appear in the
local cxt, of course, not in the elephant packet.

If (IS ELEPHANT RALPH) is added, the system adds (ELEPHANT-PKT RALPH).
On the next fetch of (COLOR ...), both (COLOR CLYDE GRAY) and (COLOR RALPH
GRAY) will be generated. (The way the system finds all bindings, of
course, is to FETCH (ELEPHANT-PKT ?X) and use the retrieved substitutions.)

Of course, this is wasteful as it stands, since the system doesn't get
beyond the potential item. In fact, it must replace it with its
actualizations. (It must be cautious enough to remember that it did that,
so it can reactivate it in case another elephant appears.)

Finally, "data-dependency notes" like those of Sect. II.A.3 must be
attached to actualizations like (COLOR CLYDE GRAY), so that if (IS ELEPHANT
CLYDE) is removed from the cxt, the actualizations can be removed, too.
And when the last (ELEPHANT-PKT ...) packet-closure is gone, the layers of
the ELEPHANT-PKT must be excluded from the current cxt.
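The actualization of potential items can be illustrated with a rough Python sketch. It rests on several simplifying assumptions: a single cxt, no index and no layers, packet-closures added directly instead of through an ANTEC-ITEM, and potential items re-examined on every fetch rather than being replaced by their actualizations. The class `Cxt`, the helpers, and the spelling "?##ELEPHANT" for the quasi-constant are illustrative only.

```python
def is_variable(x):
    return isinstance(x, str) and x.startswith("?")

def matches(pattern, item):
    """Variables (including quasi-constants) on either side match anything."""
    if is_variable(pattern) or is_variable(item):
        return True
    if isinstance(pattern, tuple) and isinstance(item, tuple):
        return len(pattern) == len(item) and all(map(matches, pattern, item))
    return pattern == item

def substitute(expr, variable, value):
    if isinstance(expr, tuple):
        return tuple(substitute(e, variable, value) for e in expr)
    return value if expr == variable else expr

class Cxt:
    def __init__(self, packets):
        self.packets = packets  # packet name -> (quasi-constant, potential items)
        self.items = set()      # actual items of the local cxt
        self.support = {}       # actualization -> packet-closure it depends on

    def add(self, item):
        self.items.add(item)

    def remove(self, item):
        self.items.discard(item)
        # Data-dependency notes: flush actualizations the closure supported.
        for derived in [d for d, s in self.support.items() if s == item]:
            del self.support[derived]
            self.items.discard(derived)

    def fetch(self, pattern):
        for name, (variable, potential) in self.packets.items():
            # Like FETCHing (ELEPHANT-PKT ?X) to find all bindings.
            closures = [c for c in self.items if c[0] == name]
            for potential_item in potential:
                if not matches(pattern, potential_item):
                    continue
                for closure in closures:
                    actual = substitute(potential_item, variable, closure[1])
                    if actual not in self.items:
                        self.items.add(actual)  # lands in the local cxt
                        self.support[actual] = closure
        return [i for i in self.items if matches(pattern, i)]
```

A fetch of (COLOR CLYDE ?C) after adding (ELEPHANT-PKT CLYDE) produces (COLOR CLYDE GRAY) on demand, and removing the packet-closure removes that actualization again.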


III.C Further Embellishments and Comments

When a packet is being built, it is desirable to be able to make
deductions from its potential items as they are added. For example, it is
important to be able to include a closure of the MAMMAL packet in
ELEPHANT-PKT when it is built. So at this time the system must treat
?##ELEPHANT as an actual constant that can take part in matches, etc. When
(IS MAMMAL ?##ELEPHANT) is added, it is not treated as a potential item,
but causes a (MAMMAL-PKT ?##ELEPHANT) to be included in the ELEPHANT-PKT
cxt. This is not just applicable to transitive IS deductions; it works for
ordinary antecedent reasoning, and for inclusion of other kinds of packet
(such as the TUSK packet, two closures of which might be included in the
elephant packet).



Thus there will be dependencies among the potential items. For
example, (CAN ?##ELEPHANT (HIDE-IN FILING-CABINETS))* might be marked as a
consequence of (COLOR ?##ELEPHANT GRAY)*. This will not matter much unless
some user of the packet tries to remove the COLOR item for a particular
elephant. This is intuitively a complex operation, since the reasons for
and consequences of an alteration to a structured data base may be
convoluted. However, some simple bookkeeping can be done by the data base
manager. When (COLOR CLYDE GRAY) is removed, (CAN CLYDE (HIDE-IN
FILING-CABINETS)) should go, too. The situation may be represented as
follows:

ELEPHANT-PKT
    (color ?##elephant gray)* ------> (can ?##elephant
              |                           (hide-in filing-cabinets))*
              |                                 :
        (elephant-pkt                     (elephant-pkt
            clyde)                            clyde)
              |                                 :
CURRENT       |                                 :
CXT           v                                 v
    (color clyde gray) - - - - - - -> (can clyde
                                          (hide-in filing-cabinets))

Figure III.1
Even though the broken lines are not yet present, they must be put in.
Then (COLOR CLYDE GRAY) and its supportees can be removed as described in
Sect. II.A.3. The actualization links must be marked so that future
fetches will know not to add the actualizers again. This process is called
detaching the item (COLOR CLYDE GRAY) from its packet.
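
The bookkeeping in this example can be sketched in a few lines of Python.
This is only an illustration, not the PANACEA code: the class name
DependencyStore and its methods are hypothetical, items are plain tuples,
and an item is dropped as soon as any one of its supporters goes (a fuller
treatment would have to consider items with more than one justification).

```python
from collections import defaultdict


class DependencyStore:
    """Toy item store that records which items support which others."""

    def __init__(self):
        self.items = set()
        self.supportees = defaultdict(set)  # item -> items justified by it
        self.detached = set()               # items that must not be re-added

    def actualize(self, item, supported_by=()):
        """Add an actualization unless it was detached earlier."""
        if item in self.detached:
            return False
        self.items.add(item)
        for supporter in supported_by:
            self.supportees[supporter].add(item)
        return True

    def remove(self, item):
        """Remove an item and, recursively, everything it supports."""
        if item not in self.items:
            return
        self.items.discard(item)
        for dependent in self.supportees.pop(item, set()):
            self.remove(dependent)

    def detach(self, item):
        """Remove an item and remember not to actualize it again."""
        self.detached.add(item)
        self.remove(item)


db = DependencyStore()
color = ("COLOR", "CLYDE", "GRAY")
can = ("CAN", "CLYDE", ("HIDE-IN", "FILING-CABINETS"))
db.actualize(color)
db.actualize(can, supported_by=[color])
db.detach(color)  # the CAN item goes too; a later fetch will not re-add COLOR
```

The `detached` set plays the role of the mark on the actualization link:
a later attempt to actualize the same item is refused.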

This mechanism (which in its full generality is more complex) can be
used for building packets defined in terms of other packets plus certain
exceptions. For example, ALBINO-ELEPHANT-PKT may be defined as the
ELEPHANT packet cxt with one layer added, in which (COLOR
?##ALBINO-ELEPHANT GRAY)* is removed and (COLOR ?##ALBINO-ELEPHANT WHITE)*
is added.



If (IS ALBINO-ELEPHANT CLYDE) is added to a cxt, a FETCH of (COLOR CLYDE
GRAY) finds the following situation:



ELEPHANT-PKT
   (color ?##elephant gray)* ------> (can ?##elephant
            |                             (hide-in filing-cabinets))*
            |                                  |
            | (elephant-pkt                    |
            |    ?##albino-elephant)           |
            | *EXCEPTION*                      | *EXCEPTION*
            v                                  v
ALBINO-ELEPHANT-PKT
   ((color ?##albino-elephant         ((can ?##albino-elephant
       gray) *ABSENT*)                    (hide-in filing-cabinets))
                                        *ABSENT*)

CURRENT CXT
   (albino-elephant-pkt Clyde)

Figure III.2

FETCH finds (COLOR ?##ELEPHANT GRAY)*, but in attempting to find all the
elephants, it notices the *EXCEPTION* links through the ALBINO-ELEPHANT
packet, and leaves Clyde out.

The "potential item" implementation of packet-closures has the feature
that it is independent of a particular relation like set-inclusion ("IS").
For example, a large implication like

(loves ?x ?y)
  ⊃ (jealous-of ?x ?y) ∧ (admires ?x ?y)
    ∧ (habitually (together ?x ?y))
    ∧ (when (do ?x (for-some movie
                      (lambda (m) (attend ?x m))))
            (presumably (accompanies ?y ?x)))

can be handled easily by creating a "love-affair" packet, containing items
like (JEALOUS-OF ?##LOVER ?##LOVEE)*. In fact, implication itself is only
incidentally connected with the use of packets. Any large conjunction of
items sharing free variables may be represented as a packet.

If the reader is feeling optimistic about packets, here are some
cautions. First, although the "semantic baggage" problem doesn't look so
bad any more, the other problems Fahlman is concerned with are not attacked
directly by my solution to it. In particular, typical consequences of
something's being an elephant are not automatically usable in recognition
of elephants. Instead, elephant recognition knowledge will be contained
in, say, the jungle packet, with the typical frame organization. (This is
as good a place as any to point out that my absurd examples in this chapter
are not representative of useful elephant knowledge, but only illustrate
formal problems and solutions.)

The biggest drawback to my system is that it relies on a packet-closure,
such as the one expressing Clyde's elephanthood, being a temporary
structure, frequently added, removed, or abandoned. Otherwise, if all
known information about all known elephants is always around, it will take
too long to smash a potential item like (COLOR ?##ELEPHANT GRAY)*, and most
of the products of the effort will be useless. Note that there is really
no difference between the potential item idea and a particular kind of
consequent ("IF-NEEDED") reasoning. This reasoning is: "to find Clyde's
color, find what types of colored entities are around, deduce what they
are, and see if Clyde is one of them." This deduction is done efficiently
by following the actualization pointers down through layers of packet (cf.
Fig. III.2), and we get the exception mechanism cheap by having the system
do it this way, but the combinatorial limitations are the same. (I believe
that even in bad cases this method is superior to the other approaches
mentioned at the beginning of this section, which involve proceeding from
Clyde to COLOR. My approach always sticks to pathways which are guaranteed
to lead to present COLOR items (or exceptions). Furthermore, once they
have been found, the potential item that generated them is gone.)

In my own research on electronic design <McDermott, 1974b>, this
assumption that only a few closures of a packet will be around at one time
is justified. A closure of the "af-amplifier" packet will be made and
included in the cxt for a radio receiver when an af amplifier is needed.
But it will not be visible in cxts for later tasks. I think this sort of
organization is typical of problem solving systems (and is crucial to the
frame theory that most people accept), but there may be situations where it
is inappropriate. (See Sect. V.)

III.D Implementation

Although it is too early to tell how successful my packet scheme can
be, I have implemented a preliminary version of it, the PANACEA data base
manager. It is based on a modified Conniver data base with a data-driven
indexer but no sub-indexing, which it uses as an associative memory and
item uniquizer. It buffers the user from the actual data by smashing all
potential item data as described above.

The PANACEA system itself is quite small, although my programming style
has made the TDBM and other system programs it relies on too big and
clumsy. Most of the complexity of PANACEA is in the routines that maintain
data dependencies and exception links in a consistent state. The actual
potential-item smasher occupies two or three pages of LISP code. It is
hard to tell how fast it runs at this time, since it has not been fully
compiled, and since there are known inefficiencies in the support routines
for it. The best measure is the number of matches (unifications) and
variant-tests done in the course of a fetch. This number depends on the
complexity of the packets involved, but is much smaller than it would be in
an ordinary deductive scheme.

The item representation I am using is that of Boyer and Moore <1972>,
in which an item is stored as a pattern and separate substitution. By
analogy with function closures, the stored substitution is called the
"environment" of the item. The original purpose was to save storage and
enhance readability of formulas by representing a deduced formula as a
pattern plus bindings acquired during its deductive history. However, it
has the additional advantage for PANACEA that it enables potential items to
be actualized by just switching their environments before adding them.
Unfortunately, this is paid for in other ways, since special functions have
to be written to manipulate items stored this way.
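
A small Python sketch may make the pattern-plus-environment idea concrete.
It is an illustration only: the names Item, actualize, and formula are
hypothetical, patterns are nested tuples, and variables are simply strings
beginning with "?".

```python
def instantiate(pattern, env):
    """Substitute the bindings in env into a nested-tuple pattern."""
    if isinstance(pattern, tuple):
        return tuple(instantiate(part, env) for part in pattern)
    return env.get(pattern, pattern)


class Item:
    """An item stored as a shared pattern plus a separate environment."""

    def __init__(self, pattern, env):
        self.pattern = pattern
        self.env = env

    def actualize(self, new_env):
        # Only the environment changes; the pattern itself is shared.
        return Item(self.pattern, new_env)

    def formula(self):
        # This is the "special function" cost: the formula has to be
        # reconstituted whenever the item is needed in explicit form.
        return instantiate(self.pattern, self.env)


potential = Item(("COLOR", "?##ELEPHANT", "GRAY"), {})
actual = potential.actualize({"?##ELEPHANT": "CLYDE"})
print(actual.formula())  # ('COLOR', 'CLYDE', 'GRAY')
```

Actualizing allocates no new pattern structure, which is the storage
saving the representation was originally meant to provide.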



IV INDEXING BY CONTEXT

IV.A Structural Indexing

Except for cancellation, a cxt behaves like the union of the sets of
data represented by its cxt layers. Therefore, it is a particularly ugly
feature of the classical implementation that it finds all the data likely
to match a pattern and throws away the ones not in the current cxt, instead
of taking the union only of the relevant sets. Excluding a layer from the
current cxt only means hiding it from yourself, not from the data base
machinery. It has always seemed that these facts make cxts fit only for
toy problems; that domain-dependent data structures or multiple CPU's
would be needed for large data bases.

The problem is especially pressing in view of our willingness in Sect.
III to have many items like (COLOR ?##ELEPHANT GRAY)* normally hidden from
view. Clearly, we cannot afford to have every FETCH of (COLOR ?X ?C) filter
a bucket containing a formula for every type of object with a usual color.

The first alternative that comes to mind is to rehash any index buckets
that have gotten too big by the layers their data are present in. On
fetching, the system takes each layer in the fetch cxt and hashes its
number to retrieve the relevant sub-buckets, if any. Then it unions the
sub-buckets.

The problem with this is that most layers will not mention any data
matching the current fetch pattern. Of 100 layers, most will mention no
COLOR assertions, for example. This scheme is still probably an
improvement over the original. (Its efficiency depends more on the length
of the cxt than the number of matching data anywhere in the data base,
which seems better.)

IV.A.1 Merge History Graphs

So far, I have been treating cxt layers as pearls of data which can be
strung as we will. This notion is what leaves us with so little structure
to use here, where we need it. In actual practice, it is possible to
impose some useful discipline on the building of cxts without losing any
real flexibility.

Observe first that ninety-nine percent of all Conniver cxt manipulation
is in terms of PUSH-CONTEXT and POP-CONTEXT. PUSH-CONTEXT adds a new layer
on the front of a cxt, and POP just gets back what you started with. (It
is equivalent to CDR.) Because we want to manipulate packets, let us
generalize PUSH to merge: take n cxts, union them, and push a new layer
onto the result. Merging may be used to include a packet-closure in a cxt.
(We are also going to need an implementation of packet-closure exclusion,
which I will describe later.)

If we restrict ourselves to merging for the time being, every cxt has n
parents of which it is the child. The process starts with the empty cxt,
with no layers, called 0. All one-layer cxts are children of 0 (the result
of a merge with n=0). Every layer is the first layer of exactly one cxt.
Over time, the structure of cxts might evolve like this:
Merge History Graph
Figure IV.1

Notice that there may be more than one merge of the same two cxts. This
fact is basically irrelevant, and I will ignore it, allowing myself to say
"the" merge of two cxts. The new layer in each cxt is called its primary
layer, of which it is the primary cxt.
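
As a rough illustration of these definitions, the merge history graph can
be kept as a table from each primary layer to the layers of its parent
cxts. The Python below is a sketch under assumptions of its own: the class
name CxtGraph is hypothetical, layers are numbered in order of creation,
and a cxt is named by its primary layer.

```python
class CxtGraph:
    """Merge history graph; each cxt is identified by its primary layer."""

    def __init__(self):
        self.parents = {}                 # primary layer -> parent layers
        self.layers = {0: frozenset()}    # primary layer -> layers in cxt
        self._next = 1

    def merge(self, *cxts):
        """Union n cxts and push one new layer onto the result."""
        new = self._next
        self._next += 1
        # A merge with n=0 yields a one-layer cxt, a child of the empty cxt 0.
        self.parents[new] = list(cxts) if cxts else [0]
        self.layers[new] = frozenset({new}).union(
            *(self.layers[c] for c in cxts))
        return new

    def push(self, cxt):
        """PUSH-CONTEXT is the special case of merge with n=1."""
        return self.merge(cxt)


g = CxtGraph()
a = g.merge()        # one-layer cxt (1)
b = g.merge()        # one-layer cxt (2)
c = g.merge(a, b)    # cxt (3 2 1), with parents (1) and (2)
d = g.push(c)        # cxt (4 3 2 1)
```

Every layer is created by exactly one call to merge, so each layer is the
first layer of exactly one cxt, as required above.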

The merge history graph gives us some structure to work with. Let us
associate with every bucket in the index its key paths, each of which is a
(partial) descending branch through a cxt graph like Fig. IV.1. (A "key
path" is a "path used as a key," not a vitally important path.) A key path
for a bucket must terminate on a cxt which is a super-cxt of the cxt of
each datum in the bucket. For example, in Fig. IV.1, <0, (1), (5 4 3 2 1)>
would be a key path for a bucket of data associated with (5 4 3 2 1) and
with cxts descended from it. Since a layer unambiguously identifies its
primary cxt, I will abbreviate this key path to <0, 1, 5>. Hitherto, all
buckets have implicitly had key path <0>, since the cxt of every item is a
descendant of 0. (An item may have more than one cxt, but that's not
important.)

Hitherto, however, such a bucket has often been able to grow without
bound. Now, when its size exceeds some threshold, let the system rehash it
into sub-buckets, each of whose key paths is 1 longer than the old one it
came from. For example, a <0> bucket may be broken down into <0, 1>, <0,
2>, and <0, 3> sub-buckets. (If any of these is empty, it should be
omitted.)

Now say (P a) is in layer 1 and (P b) in 2. I will abbreviate this as
(P a)_1, (P b)_2. The system can put (P a)_1 in bucket <0, 1>, put (P b)_2
in bucket <0, 2>, and omit the <0, 3> bucket. Now when a (FETCH '(P ?)) is
done in cxt (6 1), the data-base manager can ignore some of the (P ...)
items altogether. It just "inverts" (6 1) (this may be done in advance) to
give <0, 1, 6>, and takes the <0, 1> sub-bucket when it discovers the <0>
bucket to have been rehashed.

Clearly, this process is extendable. Say that (P c)_6, (P d)_6, (P e)_1,
(P f)_9, (P g)_9 are added to this bucket structure. All have inverted cxts
starting with <0, 1, ...>, so they are all placed in the <0, 1> sub-bucket.
Now a fetch of (P ?) in cxt (1) will get [(P a)_1, (P c)_6, (P d)_6,
(P e)_1, (P f)_9, (P g)_9], all but two of which are then filtered out.
This low efficiency indicates the list has gotten too long. So it is
rehashed into three sub-buckets:

Key Path      Contents

<0, 1>        [(P a)_1 (P e)_1]

<0, 1, 6>     [(P c)_6 (P d)_6]

<0, 1, 9>     [(P f)_9 (P g)_9]

Now if the fetch is from cxt (1), the system gets [(P a)_1 (P e)_1]
without any filtering. If it is from (6 1), it takes the union of the
<0, 1> and <0, 1, 6> sub-buckets to get [(P a)_1 (P e)_1 (P c)_6 (P d)_6].

So long as only pushes are used to make new cxts, the fetch algorithm
(once you have a bucket) is

(a) invert the fetch cxt

(b) use successive prefix strings of the inversion as key paths to find
your way down the sub-bucket tree

(c) take the union of the sub-buckets found, filtering out items not
actually in the fetch cxt

Part (b) will proceed only so long as the as yet unseen sub-buckets
generated by past ADDs are too big for part (c) to be efficient.
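
Steps (a) to (c) can be sketched as follows for the push-only case. The
Python is an illustration, not the indexer itself: the class Bucket, the
helper invert, and the fixed size threshold are assumptions, a cxt is
written as a tuple with its newest layer first, and each item is added
together with the cxt whose primary layer it lives in.

```python
def invert(cxt):
    """Turn a cxt such as (6, 1) into the key path (0, 1, 6)."""
    return (0,) + tuple(reversed(cxt))


class Bucket:
    def __init__(self, depth=1, threshold=5):
        self.depth = depth          # length of this bucket's own key path
        self.threshold = threshold  # size that triggers rehashing
        self.items = []             # (item, key path of the item's cxt)
        self.subs = None            # None until rehashed, then {layer: Bucket}

    def _child(self, layer):
        if layer not in self.subs:
            self.subs[layer] = Bucket(self.depth + 1, self.threshold)
        return self.subs[layer]

    def add(self, item, cxt):
        self._add(item, invert(cxt))

    def _add(self, item, path):
        if self.subs is not None and len(path) > self.depth:
            self._child(path[self.depth])._add(item, path)
            return
        self.items.append((item, path))
        if self.subs is None and len(self.items) > self.threshold:
            old, self.items, self.subs = self.items, [], {}
            for it, p in old:       # rehash: move items one step down
                self._add(it, p)

    def fetch(self, cxt):
        path = invert(cxt)                            # step (a)
        found, node = [], self
        while node is not None:
            # step (c): union, keeping only items whose cxt is in the fetch cxt
            found += [it for it, p in node.items if path[:len(p)] == p]
            if node.subs is None or len(path) <= node.depth:
                break
            node = node.subs.get(path[node.depth])    # step (b)
        return found


index = Bucket(threshold=5)
for item, cxt in [("(P a)", (1,)), ("(P b)", (2,)), ("(P c)", (6, 1)),
                  ("(P d)", (6, 1)), ("(P e)", (1,)), ("(P f)", (9, 1)),
                  ("(P g)", (9, 1))]:
    index.add(item, cxt)

print(index.fetch((1,)))    # ['(P a)', '(P e)']
print(index.fetch((6, 1)))  # ['(P a)', '(P e)', '(P c)', '(P d)']
```

With a threshold of 5, the seven items force the <0> bucket and then the
<0, 1> bucket to be rehashed, which reproduces the three sub-buckets of the
example above.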

If merges are allowed, the algorithm is much the same, except inverting
a cxt will not give a single key path. For example, in Fig. IV.1, cxt
(5 4 3 2 1) looks like

[Figure IV.2: the inverted graph of cxt (5 4 3 2 1)]
Therefore, we must take into account the likelihood of having more than one
way to extend the key path to find sub-buckets. In this case, if the <0>
bucket is too big, the fetcher must look in the <0, 1>, <0, 2>, and <0, 3>
sub-buckets, and take the union of the results. If these sub-buckets are
themselves broken down, it must avoid finding and filtering the bucket for
(4 3 2) twice, under key paths <0, 2, 4> and <0, 3, 4>. Similarly, the
sub-bucket for (5 4 3 2 1) has three key paths. All the system has to do
is compute the "inversion" of (5 4 3 2 1) (again, in advance, if desired)
as




[Figure IV.3: a tree-shaped inversion of cxt (5 4 3 2 1)]
or one of the other arc-deletions that turns Fig. IV.2 into a tree. Of
course, the routine that first creates an ambiguously-named sub-bucket must
make sure that it is pointed to from all relevant places. Then it will not
matter which is used to find it again.
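
The arc-deletion idea can be sketched in Python. The parent table below is
an assumption: it is reconstructed from the key paths quoted in the text
(layers 1, 2, and 3 are children of 0, layer 4 merges (2) and (3), and
layer 5 merges (1) and (4 3 2)), not copied from the figures, and the
function names are hypothetical.

```python
# Assumed merge history for cxt (5 4 3 2 1): layer -> parent layers.
PARENTS = {1: [0], 2: [0], 3: [0], 4: [2, 3], 5: [1, 4]}


def key_paths(layer, parents):
    """All key paths leading from the empty cxt 0 down to `layer`."""
    if layer == 0:
        return [(0,)]
    return [path + (layer,)
            for parent in parents[layer]
            for path in key_paths(parent, parents)]


def inversion_tree(layer, parents):
    """Keep one incoming arc per layer so the inverted cxt is a tree.

    Returns {layer: chosen parent} for every layer in the cxt, so each
    sub-bucket is visited once even if it has several key paths.
    """
    chosen, stack = {}, [layer]
    while stack:
        current = stack.pop()
        if current == 0 or current in chosen:
            continue
        chosen[current] = parents[current][0]   # delete the other arcs
        stack.extend(parents[current])          # but still visit all layers
    return chosen


print(key_paths(5, PARENTS))  # [(0, 1, 5), (0, 2, 4, 5), (0, 3, 4, 5)]
print(key_paths(4, PARENTS))  # [(0, 2, 4), (0, 3, 4)]
```

Which arc is kept is arbitrary here (the first parent listed); any choice
works as long as every ambiguously-named sub-bucket is reachable from each
of its key paths when it is first created.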

Before extending this concept, let me explain what its costs and
pitfalls are. First, like any table look-up scheme, it trades off the cost
of adding an entry against the cost of retrieving it. We have to spend
more time and space adding an item if we must index it by cxt. However,
memory is getting cheaper, and this scheme will work with cheap mass
storage (Sect. IV.C). Furthermore, if necessary we can arrange to expend
the resources primarily on cxts, like packets, which are to be relatively
permanent, fetched from more often than added to.

Second, it seems as though it could be a big nuisance to have to push a
new layer onto a cxt every time you do a merge. Why not just use the old
top layer, especially if you're doing only retrieval and no new storing?
The answer is, you can: I just wanted the one-to-one correspondence between
layers and cxts for purposes of clear exposition. Since it's the complete
cxt inversion that guides the fetcher, it doesn't matter whether two cxts
share the same top layer. (Of course, two cxts sharing a primary layer can
have side effects on each other.)

Third, there are "pathological" cases where this scheme will not work
any better than the primitive union case (and practically reduces to it).
If a cxt is shallow and wide, like this:
Figure IV.4
and its items are sparsely distributed among its layers, then we are back
to unstructured unions. However, this is not a very common case, because
packet-like cxts tend to have been built by merges. Thus the packet for
"elephant" will include the mammal packet, as will the packet for "wombat."
If both elephant and wombat packet-closures are added to the current cxt,
they will not be children of 0:




[Figure IV.5: merge graph for cxt (65 elephant wombat mammal 1)]
(Here and later, I use, e.g., "elephant" to mean the primary layer of the
elephant packet cxt.) In general, I expect there to be a few "utility
packets" for a given domain ("body part," for example, in one domain, and
"electronic component" in another) that will serve as strong keys for
pursuing sub-buckets. Even though most of them will draw a blank for a
given pattern, they will then hide their sub-cxts (e.g., the cxts for
animals or circuits whose parts they represent) from being used as key path
elements at all.

It may be possible to avoid this problem by artificially ensuring that
no cxt has more than some small number N of children. Whenever a cxt
acquires N+1 children, the system could insert dummy layers between it and
its children to group them into fewer than N groups, each with fewer than N
members. If the data are sparsely enough distributed, some of the
intermediate cxts will draw blanks and save us some work. However, I
predict this is more trouble than it is worth.

A good remedy for many problems of this kind is to store a complete
list of the sub-buckets of a bucket and throw out the sub-buckets with bad
key paths instead of finding those with good ones. If the data are
distributed so that they all fall into 2 or 3 sub-buckets out of a possible
500 key path extensions, it is worth looking at it this way.

To see how this remedy might apply, consider the case of the creation
of a huge set of data matching a pattern, in just one cxt that is the
result of many pushes:
Figure IV.6

The problem here is that, when fetching from (4999 ... 1), the indexer does
not know it is on a wild-goose chase until the last step. There are always
lots of data still around down there, or there would be just one bucket to
be filtered the old way.



This problem will be solved if we make a note in the bucket for cxt (1)
that there is only one non-empty sub-bucket (or two or three) anywhere
below it in the graph. If so, it is better just to keep that bucket around
and test its key paths against the fetch cxt. In fact, there is no reason
to build the actual graph until the number of non-empty buckets exceeds
some threshold, and then only the first couple of forks need to be built
immediately. Consequently, we can modify the previous algorithm to include
the possibility that a bucket has, say, four or fewer sub-buckets, which
are just filtered a bucket at a time, not by individual item. The graph
structure exists only to guide the indexer to this situation or the old
case of one small bucket as yet unrehashed. (There is no difficulty in
interleaving layers of graph and amorphous bucket list.) For example:
Figure IV.7
Here, from (1) there are eight big buckets, too many to examine one by one
(let us assume). However, if the fetch cxt is (4999 ... 1), after one step
down the tree there is only one big bucket left to worry about, the one for
(5000 ... 1), which can be compared with (4999 ... 1) and ignored.
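
Filtering "a bucket at a time" might look like the following Python sketch.
It assumes, hypothetically, that a rehashed bucket keeps a flat list of its
few non-empty sub-buckets as (key path, items) pairs, and it accepts a
sub-bucket only when every layer on its key path is in the fetch cxt.

```python
def fetch_from_sub_bucket_list(sub_buckets, fetch_cxt):
    """Accept or reject whole sub-buckets by testing their key paths.

    sub_buckets: list of (key_path, items) pairs for the non-empty
                 sub-buckets lying anywhere below some bucket.
    fetch_cxt:   tuple of layers, newest first.
    """
    layers = set(fetch_cxt) | {0}
    found = []
    for key_path, items in sub_buckets:
        if set(key_path) <= layers:   # one test per bucket, not per item
            found.extend(items)
    # Accepted items may still need the ordinary per-item cxt filter.
    return found


# One huge bucket created in cxt (5000 ... 1), as in the example above.
big = (tuple(range(0, 5001)), ["datum-%d" % i for i in range(3)])
fetch_cxt = tuple(range(4999, 0, -1))       # the cxt (4999 ... 1)
print(fetch_from_sub_bucket_list([big], fetch_cxt))   # []
```

A single comparison of key path against fetch cxt replaces a descent of
several thousand steps that would have ended at an irrelevant bucket.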



IV.A.2 Excision of Cxt Layers

To implement packet-closure exclusion, we have to have a way to remove
cxts from a merge. The way I have developed, although somewhat
unintuitive, serves this purpose well enough. (It is so unintuitive that
you may want to skip this section, if you wish to take my word for it that
cxt layer excision is compatible with cxt indexing.)

A first guess at how such alterations ought to work is to generalize
the usual POP-CONTEXT operation, and allow any layer to be removed from a
cxt. However, this turns out to be not as useful as the seemingly more
indirect method I will describe, for two reasons. First, we want the
operation to implement packet exclusion. Since there may be more than one
closure of a packet in a given cxt (and a given layer may be in more than
one packet), we must solve the problem of how to "undo" just one inclusion
of a packet, removing only those layers with no other attachment to the
cxt. Second, the alteration method must preserve the usefulness of any
index structure that may have been previously built on a cxt.

Therefore, it is desirable to represent an excision in terms of
alterations to the merge history graph. Let us augment the usual merge
operation with the link cut: a notation to the effect that one edge of the
graph is to be invisible in the result of the merge and cxts later derived
from it:
Figure IV.8
In Fig. IV.8, layer 7 has been pushed onto (6 5 4 3 2 1), and layer 3 has
been deleted. However, deleting a link doesn't always have any immediate
effect.




[Figure IV.9: a merge history graph with a cut, yielding cxt (6 5 4 3 2)]
In Fig. IV.9, layer 1 cannot be flushed from cxt (5 4 3 2 1), because there
is another link to it from (4 3 2 1). But (6 5 4 3 2), generated by
excision of "*," has lost layer 1.

Sometimes deleting a link removes more than one layer:
Figure IV.10
Here, snapping the link removes the last tie to layer 2 as well as layer 4.

Unfortunately, there may be more than one path to a link (which
corresponds to its being part of a twice-included packet). In this case,
we must specify which one we mean:
Figure IV.11
Here, we mean to delete one "occurrence" of the link "*," as determined by
its subsequent history. The cut is specified by its target link "*" and
the path taken to it. Since there remains another path, layers 3, 2, and 1
remain.

In general, the rule is that cutting a link during a merge causes the
excision of all layers whose primary cxts can no longer be reached from the
result of the merge. A cxt cannot be reached only if every path to every
link immediately below it is cut by some node on the path.



In Fig. IV.12, there are four paths to link "*" from (9 8 7 ... 1).
[Figure IV.12: a merge history graph with cuts, yielding cxt
(9 8 7 6 5 4 3 2 1)]
Only two of the paths are cut in producing (9 8 7 6 5 4 3 2 1), so layer 1
remains in the merge.

If a cut diagram leaves a path ambiguity, my convention is that all
paths to the cut link are simultaneously flushed.
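
The reachability rule can be sketched in Python under one simplifying
assumption: a cut names only an edge, so that, as in the convention just
stated, every path through that edge is flushed at once. Path-specific cuts
are not modelled. The function name and both example graphs are
hypothetical and do not reproduce the figures.

```python
def layers_after_cut(top, parents, cut_edges=frozenset()):
    """Layers whose primary cxts are still reachable from `top`.

    parents:   {layer: [parent layers]} (the merge history graph)
    cut_edges: set of (child layer, parent layer) links made invisible
    """
    seen, stack = set(), [top]
    while stack:
        layer = stack.pop()
        if layer == 0 or layer in seen:
            continue
        seen.add(layer)
        for parent in parents[layer]:
            if (layer, parent) not in cut_edges:
                stack.append(parent)
    return sorted(seen, reverse=True)


# Cutting one link can remove several layers: 4 was the last tie to 2.
chain = {1: [0], 2: [0], 3: [1], 4: [2], 5: [3, 4], 6: [5]}
print(layers_after_cut(6, chain))                        # [6, 5, 4, 3, 2, 1]
print(layers_after_cut(6, chain, cut_edges={(5, 4)}))    # [6, 5, 3, 1]

# Cutting a link may have no effect if the layer is attached elsewhere.
shared = {1: [0], 3: [1], 4: [1], 5: [3, 4]}
print(layers_after_cut(5, shared, cut_edges={(3, 1)}))   # [5, 4, 3, 1]
```

Because excision is expressed as invisible edges rather than as deleted
layers, the merge history graph itself is unchanged, so index structure
already built on it stays valid.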

Here is another example of a merged and cut graph:
Figure IV.13
The reader should convince himself of the correctness, given the merge
history graph, of each of the cxts in Fig. IV.13.

Now that link cutting is understood, I am in a position to describe its
advantages in more detail. First, it is complete, in the sense that a cxt
with any set of layers can be generated by a merge with appropriate cuts.
Namely, merge the primary cxts of each layer, with a cut of every link
between cxts:
Figure IV.14
Second, as promised, link-cutting is a natural implementation of packet
exclusion. For example, say Clyde and Bonny are elephants, and every
elephant has an LTUSK and an RTUSK.
Figure IV.15
The packet-closure (TUSK-PKT (RTUSK CLYDE)) can be removed by cutting
the graph this way:

[Figure IV.16: the graph of Fig. IV.15 with a cut, yielding cxt
(30 29 elephant tusk 1)]
The third and main advantage is that we can use the same indexing
strategy as for uncut graphs. All we have to be able to do is invert a cxt
with some cut branches. In many cases such as Fig. IV.16, this is trivial,
since all we have done is prune some redundant branches.

Problems arise with cxts like (10 5 4) of Fig. IV.13. Inverted, its
graph is
Figure IV.17
where the dotted areas are not there any more. On fetching from this cxt,
the system needs a new way to extend the key path <0>.

The simplest approach is to do retrieval the same as before, but to
ignore the sub-buckets with key paths <0, 1>, <0, 1, 2>, and <0, 1, 3> (if
they have been rehashed), and just use them to get to the <0, 1, 2, 4>
sub-bucket. A solution like this may be painful if we have to follow a long
"phantom" trail like this one, only to find an empty bucket at the end. But
there is no reason not to hang the <0, 1, 2, 4> sub-bucket directly from
the <0> bucket for future reference, and give it the new key path <0, 4>.
Cxt (4 3 2 1) becomes a "pseudo-child" of 0. The link from the <0> bucket
to the <0, 4> sub-bucket is only used for sub-cxts of (10 5 4) and its ilk.
(Notice that the system must do this for every bucket in the index rehashed
on these cxts; having organized the <<COLOR, ...>> bucket this way will not
help it on a fetch of (SIZE ...).)

The implementation of cxt exclusion is compatible with the trick of
associating with every bucket a list of its sub-buckets, to be used to
speed up searches for isolated sub-buckets. (Sect. IV.A.1) Pseudo-child
links do not alter the sub-bucket list, but merely shorten the paths to
some sub-buckets in some circumstances.

IV.A.3 Cancellation

So far, I have treated the cancellation of a cmarker (Sect. II.A.2) as
an infrequent occurrence. It is handled by the still-necessary cxt
filtering step that is done on the final output from the index. This is
proper if only a few data are cancelled in a particular cxt.

However, there may be cases in which cancellation is so frequent that
the system I have outlined works less well. (It can't do worse than the
current algorithm.) For example, assume again that cxts are being used to
model successive situations, whose approximate time is represented as items
of form (TIME x). A sequence might look like this:

[Figure IV.18: a sequence of cxts CXT1, CXT2, CXT3, CXT4, ... containing
the items (time morning), (time lunch), and (time afternoon), with
cancellations marked X]
The X's mark the cancellation of an item. Each successive cxt is formed by
pushing the previous one. Consequently, a FETCH of (TIME ?) in CXT4 will
union the sub-buckets of the <<TIME, CAR>> bucket for CXT's 1, 2, 3, and 4.
Only one of the three items found will survive the test for cancellation.
By making the string of cxts and items longer, the efficiency can be made
arbitrarily low.

My feeling is that in practice this problem will not be serious,
because the only cxts that get fat enough to require sub-indexing are the
more or less permanent ones that are planned too carefully to be the result
of a long string of cancellations. As discussed in <McDermott, 1974a>,
there is a need to reorganize and epitomize cxt sequences periodically if
they are to represent long stretches of time in a useful way. During such
a summary, the cancellations tend to be left out.

However, for curious purists who think it important that an algorithm
always work, there are ways of avoiding some of the trouble. If you think
of cancellation as a complement operation superimposed on the union of
sub-buckets for different layers, the problem is that taking one large set
from another can give an irritatingly small return from a large investment
(while an expensive union is always worth it). Once the system has done
such a complementation, however, it might as well try to save the result
for future reference. For example, in the <<TIME, CAR>> bucket for CXT4
we can (after one reference) mark the sub-bucket [(TIME AFTERNOON)] as
"complete," meaning that the buckets above it should be ignored, not
unioned and filtered. The problem with this approach is that some
bookkeeping is required to unmark it if an item is added or removed from a
super-bucket. (One scheme: every bucket could be marked with its "time of
completion"; a complete bucket would be accepted only if its time of
completion was after the completion times of all its superiors.)
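
One reading of the parenthetical scheme can be sketched in Python. The
interpretation is an assumption: each bucket records a logical time at
which it was last changed, and a saved "complete" result is trusted only
if it was saved after the last change to every bucket above it. The class
and attribute names are hypothetical.

```python
import itertools


class TimedBucket:
    _clock = itertools.count(1)     # shared logical clock

    def __init__(self, superiors=()):
        self.superiors = list(superiors)   # buckets above this one
        self.last_change = next(TimedBucket._clock)
        self.completed_at = None
        self.cached = None

    def touch(self):
        """Call whenever an item is added to or removed from this bucket."""
        self.last_change = next(TimedBucket._clock)

    def mark_complete(self, result):
        """Save the outcome of one union-and-filter pass for reuse."""
        self.cached = list(result)
        self.completed_at = next(TimedBucket._clock)

    def is_complete(self):
        """True if the saved result is newer than every superior's change."""
        return (self.completed_at is not None and
                all(self.completed_at > s.last_change
                    for s in self.superiors))


cxt3 = TimedBucket()
cxt4 = TimedBucket(superiors=[cxt3])
cxt4.mark_complete([("TIME", "AFTERNOON")])
print(cxt4.is_complete())   # True: superiors can be ignored
cxt3.touch()                # an item is added or removed above
print(cxt4.is_complete())   # False: fall back to union and filter
```

No explicit unmarking is needed; a stale mark is detected by comparing
times at fetch, which is the point of stamping buckets rather than
flagging them.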

IV.B Interaction with the TDBM Indexer

I have referred vaguely to the rehashing of certain "buckets" into
"sub-buckets" by cxt structure. In Sect. II, I described an independent
bucket-sub-bucket system based on pattern features. It has to be decided
how these systems should interact.

Logically, there is no particular problem. At each step in a FETCH,
the indexer has a bucket and a remaining set of features and cxts that it
hasn't used yet. If the bucket is a raw list, it takes that. If it is
broken down by features, it uses the features it has; if by cxt, it goes
another step down the inverted fetch cxt. After each step, it has a new
bucket and less remaining key material.

The problem is purely strategic. When an oversized bucket needs to be
rehashed, there is a choice as to which of the remaining keys to rehash on.
For example, if (FETCH '(P A ?)) finds too many items, should it rehash the
<<P, CAR>> bucket by atoms or by cxts?

Notice that no matter how long a bucket is, it is "too long" under just
two kinds of circumstance: either the cxt filtering step or the match
filtering step rejected too large a fraction of its input. The proper
rehashing strategy, therefore, is to rehash on the key corresponding to
whichever of the two problems actually occurred. Thus, if there are twenty
items in the <<P, CAR>> bucket, but only one is in the fetch cxt (4 1), the
system should rehash by cxt and take the <<P, CAR>, 1> sub-bucket. If 19
of the items are in the cxt, but only one or two match (P A ?), it should
rehash by features, and take the <<P, CAR>, <A, CADR>> sub-bucket.
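
The decision rule can be written down directly. The Python below is a
sketch: the function name, the way the counts are gathered, and the 50%
rejection threshold are all assumptions made for illustration.

```python
def choose_rehash_key(n_in_bucket, n_after_cxt_filter, n_after_match,
                      max_reject=0.5):
    """Decide what to rehash an oversized bucket on.

    Returns "cxt" if the cxt filtering step rejected too large a fraction
    of its input, "features" if the match filtering step did, and None if
    neither step was wasteful.
    """
    if n_in_bucket == 0:
        return None
    cxt_reject = 1 - n_after_cxt_filter / n_in_bucket
    if n_after_cxt_filter:
        match_reject = 1 - n_after_match / n_after_cxt_filter
    else:
        match_reject = 0.0
    if cxt_reject > max_reject and cxt_reject >= match_reject:
        return "cxt"
    if match_reject > max_reject:
        return "features"
    return None


# Twenty items in the bucket, only one of them in the fetch cxt.
print(choose_rehash_key(20, 1, 1))     # cxt
# Nineteen in the cxt, but only two match the fetch pattern.
print(choose_rehash_key(20, 19, 2))    # features
```

The two calls correspond to the two cases in the paragraph above; the
counts are ones the fetcher has on hand after its two filtering steps.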

Of course, there are exceptions. If the cxt filtering step's
inefficiency is due to a lot of cancelled items, different remedies are
called for. (See Sect. IV.A.3.) If a bucket is already rehashed, as is the
<<P, CAR>, <P, CADR>> sub-bucket of Fig. II.5, there is no point in
rehashing by features. The inefficiency can happen in this case only if
the fetch pattern is something like (P ?X ?X) and there are too many items
of the form (P s1 s2) with s1≠s2. (I do not know if there is a way to
handle this case efficiently, or how important it is.)

With this scheme, the system can start out with one bucket for all
data, with key <>. It treats this top-level bucket the way it would any
other, rehashing it by relevant key when it gets too big.



IV.C Secondary Storage

An advantage of the cxt indexing scheme over the dumb cxt filtering of
the TDBM is that it appears to be compatible with efficient use of slower
non-random-access storage. Because we do not actually have to see items
not in the current cxt, they can be purged out onto a disk or drum.

However, use of secondary storage will require development of a record
I/O system for handling item data. It will be impossible to use LISP
addresses, since LISP's address space is so small (and since most LISPs
make a mess of secondary storage use). In particular, I doubt that symbols
can be stored as LISP-like atoms, unless there is a large obarray-like hash
table on the disk for emulating LISP's obarray. (Presumably, the
analogous "uniquizing" function for items, called DATUM in Conniver, will
have to be foregone, since the system can't afford to search the disk for a
variant of a new item.)

So there are two approaches one can take: either wait for a LISP with a
large, efficiently organized address space, or debug the following hack:

Assume symbols are represented by character strings. The system will
have to use LISP's READ (or a streamlined version) to read items from the
disk. Of course, we do not have to specify all of the characters in an
item's printed representation, but only those left unspecified by the keys
to the bucket the characters were found in. For example, if the <<P, CAR>>
bucket of Fig. II.4 were written to the disk, it could be stored as (a),
(b), (c), (d), (e) (plus cxt markers). Furthermore, the system can save
time by postponing the actual READing of characters when they are brought
in, until they are needed to reconstitute a bucket. That is, there is a
buffer step between asking for a bucket and reading from the disk:



               asks for             asks for
INDEX-FETCH <----------- BUFFER <----------- DISK
                bucket               record

           "<--" indicates data flow

                  Figure IV.19

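
The buffer step can be sketched in Python as follows. This is only an
illustration under stated assumptions, not the system proposed here: the
names write_bucket, read_bucket, and Buffer are hypothetical, the "disk" is
a dictionary, items are flat tuples of symbol strings, and cxt markers are
omitted. Only the symbols not already fixed by the bucket's key are
written, and the text is not parsed until the bucket is actually asked for.

```python
import re


def write_bucket(key, items):
    """Write only the symbols that the bucket key leaves unspecified."""
    fixed = {pos for _, pos in key}
    return " ".join(
        "(" + " ".join(s for i, s in enumerate(item) if i not in fixed) + ")"
        for item in items)


def read_bucket(key, text):
    """Reconstitute full items from stored text plus the bucket key."""
    fixed = {pos: sym for sym, pos in key}
    items = []
    for body in re.findall(r"\(([^()]*)\)", text):
        rest = body.split()
        tail = iter(rest)
        length = len(fixed) + len(rest)
        items.append(tuple(
            fixed[i] if i in fixed else next(tail) for i in range(length)))
    return items


class Buffer:
    """Holds raw record text; parsing is postponed until a bucket is needed."""

    def __init__(self, disk):
        self.disk = disk    # stand-in for the disk: record_id -> {key: text}
        self.raw = {}       # key -> text brought in but not yet parsed
        self.parsed = {}    # key -> reconstituted items
        self.reads = 0

    def load_record(self, record_id):
        self.reads += 1
        self.raw.update(self.disk[record_id])

    def bucket(self, key, record_id):
        if key not in self.parsed:
            if key not in self.raw:
                self.load_record(record_id)
            self.parsed[key] = read_bucket(key, self.raw.pop(key))
        return self.parsed[key]


key = (("P", 0),)
text = write_bucket(key, [("P", "a"), ("P", "b"), ("P", "c")])
buf = Buffer({0: {key: text}})
buf.load_record(0)            # characters are resident but still unparsed
items = buf.bucket(key, 0)    # parsing happens only here
```
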
The most crucial requirement for such a system is that it respect
locality of reference with respect to cxt. That is, when one bucket for a
cxt is brought in, other buckets for different features in the same cxt
should be stored in the same records. This can be achieved by keeping a
list of all buckets and sub-indexes directly associated with a cxt layer,
and writing them out together. (This will be cheaper if buckets are broken
down by cxt before feature; cf. Sect. IV.B.) Since the indexer needs the
sub-buckets for all super-cxts when FETCHing from a cxt, a FETCH from a
long-neglected cxt might cause several branches of a sub-index to be
brought in from disk.

Figure IV.20

In Fig. IV.20, the items shown are written out in order (P a), (P b),
(Q c), (D d), (P e), (Q e), (Q f). A reference to (P ?) in cxt IlJ will
anticipate a reference to (Q ?) by bringing (Q c) in.
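
The locality requirement can be sketched in Python. This is a
hypothetical illustration only: the layer names, the pack_records
function, and the LayerCache class are assumptions, and the data are not
meant to reproduce the layout of Fig. IV.20. All buckets belonging to one
cxt layer are written into one record, so a miss on any of them brings the
others in with a single read.

```python
from collections import defaultdict


def pack_records(buckets):
    """Group buckets into one record per cxt layer.

    buckets: iterable of (layer, bucket_key, items) triples.
    """
    records = defaultdict(dict)
    for layer, bucket_key, items in buckets:
        records[layer][bucket_key] = list(items)
    return dict(records)


class LayerCache:
    """Brings in a whole cxt layer's record on the first miss."""

    def __init__(self, records):
        self.records = records   # stands in for the disk
        self.resident = {}       # layer -> {bucket_key: items}
        self.disk_reads = 0

    def fetch(self, layer, bucket_key):
        if layer not in self.resident:
            self.disk_reads += 1
            self.resident[layer] = self.records.get(layer, {})
        return self.resident[layer].get(bucket_key, [])


records = pack_records([
    ("layer-1", "P", [("P", "a"), ("P", "b")]),
    ("layer-1", "Q", [("Q", "c")]),
    ("layer-2", "P", [("P", "e")]),
])
cache = LayerCache(records)
cache.fetch("layer-1", "P")   # one read brings in the whole layer
cache.fetch("layer-1", "Q")   # already resident, no further read
```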

The complexity of the issues regarding secondary-storage management is
the main obstacle to implementing the cxt indexing system I have described.
Without taking secondary storage into account, an implementation would
begin to thrash from external reasons (limitations of PDP-10's and LISPs)
before the advantages began to tell.



V SO WHAT?

The reader should by now be convinced that it is possible to implement
large Planner-type data bases efficiently, if they can be well organized.
The time required to fetch a pattern from a cxt depends on the size of the
pattern and the length of the cxt, and to a small degree on the number of
closely related data that might seriously confuse a more standard system.

It would be nice if I could prove a theorem regarding the average cost
in time and space of using a system like the one I have described, but I
lack the expertise to do it. I find it hard to imagine cases in which its
behavior would be unacceptable, but I am probably overlooking some peculiar
interactions. Most of the interactions I see are actually beneficial. For
example, the more features in a pattern, the more sub-indexes to look
through, if there are a lot of data. On the other hand, when a pattern has
a lot of features, there are unlikely to be near duplicates of it in other
cxts, so the bucket that is ultimately found will probably not be rehashed
by cxt at all. So performance is probably seldom degraded on both types of
key. I can't prove to what degree this is true, however.

The case is even more confused with my packet implementation, which
relies on the organizing power of the frame concept. Here what is needed
is experience, which I hope we will all soon have more than enough of. The
most pressing question we should be asking is, how well does a program have
to organize itself to avoid choking on irrelevant packets or frames?

Brute-force efficiency is thus a subordinate issue. A more interesting
cause I would like to rally to is the defense of Scott Fahlman's early
paper "A Hypothesis-Frame System for Recognition Problems" <1973> from the
critique in his recent "A System for Representing and Using Real-World
Knowledge." <1975> The first paper was an elegant exposition of the
developing frame theory. It is consistent with the work of Minsky <1974>,
Kuipers <1975>, Winograd <1974>, Rubin <1975>, Marcus <1975>, and Moore and
Newell <1973>, most of whom claim to be working on frames. People who
admire all this research can see how the mechanisms I have described could
be helpful.

Or maybe they can't. One problem with my exposition that makes it look
different from previous frame theories is that I haven't mentioned
recognition at all. Most of the theories are about nothing but
recognition. The computational problems all revolve around guessing a
frame to account for data, filling in that frame with data, proposing
subframes, building super-frames, transforming to new frames, and so on.
The only apparent purpose of a frame is to embed it in a more inclusive
frame.

Probably this is a minor oversight. We are expected to see by
ourselves the many advantages of finding an instance of a large "almost
right" chunk of data like a frame. For example, after a medical diagnosis
<Rubin, 1975>, a doctor program has a structure of frames representing a
hypothesized constellation of diseases. How does it know what treatment to
prescribe? Presumably, there is a frame slot TREATMENT - AMPUTATION, or
something, which can just be read off.

Indeed, "just reading it off" seems to be the retrieval mechanism frame
theorists long for. A frame represents a body of knowledge that is well
understood. "A frame is a specialist in a small domain." <Kuipers, 1975,
p. 1S£> This is why typical frame contents tend to be slots and values,
with all other information represented as auxiliary procedures. Indeed,
the slot-value system reminds me of nothing so much as a called function
with all its variables bound (with some slots being modelled as optional
arguments with default values). Kuipers' frame implementation is just like
this. His frames are processes which communicate by sending each other
requests for slot values and the like.

All of this concentration on slots, values, and message passing makes
me unhappy. Why did all these people abandon the Planner-type data base?
There is no real difference conceptually in how a Planner-type index stores
slot-value pairs. Clearly, if all assertions are of the form (COLOR RED)
or (TREATMENT AMPUTATION), it doesn't matter whether they are stored in an
a-list or a cxt. The problem lies in representing more complex pieces of
information, like "the way to spot a gnurd is that his socks don't match."
Well, this is a problem; it might be the AI problem.
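
The point about slot-value pairs can be made concrete with a small Python
sketch. It is an illustration under assumptions, not a rendering of
Planner or Conniver: the fetch function, the "?name" variable syntax, and
the IMPLIES encoding of the gnurd fact are all hypothetical. Simple
assertions are retrieved equally well from an a-list (here a dictionary)
or from a pattern-matched set of assertions; only the latter has an
obvious place for a fact that is not a slot-value pair.

```python
def fetch(assertions, pattern):
    """Return the assertions matching a pattern.

    Strings beginning with '?' are variables; a variable that occurs
    twice must match the same value both times.
    """
    results = []
    for fact in assertions:
        if len(fact) != len(pattern):
            continue
        bindings = {}
        for p, f in zip(pattern, fact):
            if isinstance(p, str) and p.startswith("?"):
                if bindings.setdefault(p, f) != f:
                    break
            elif p != f:
                break
        else:
            results.append(fact)
    return results


# Slot-value pairs stored as an a-list ...
alist = {"COLOR": "RED", "TREATMENT": "AMPUTATION"}

# ... and the same pairs stored as pattern-accessible assertions,
# together with one fact that has no natural slot.
cxt = [
    ("COLOR", "RED"),
    ("TREATMENT", "AMPUTATION"),
    ("IMPLIES", ("MISMATCHED", "SOCKS"), "GNURD"),  # hypothetical encoding
]

color_from_alist = alist["COLOR"]
color_from_cxt = fetch(cxt, ("COLOR", "?value"))
gnurd_signs = fetch(cxt, ("IMPLIES", "?sign", "GNURD"))
```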

Let me suggest that a most important feature of a representation of
facts like this is that it be obvious what the fact is and why it is
believed and what might have been believed in its place and how it might
have been used, as well as how it is being used. Unless a program is such
an expert that it knows, by reading off a slot, what to do in any
conceivable situation, someday it is going to fail to solve a problem and
have to debug its world model. (Cf. <Sussman, 1975>.) It will have to start
from the debris of its previous efforts. If this debris consists only of
program counters of processes confused by conflicting slots, it will be
useless. A program that relies completely on passing messages of
anticipated form can have only unpleasant surprises. Intelligent people
are pleasantly surprised by problems all the time.

Anyway, packets and cxts can support all the usual ways of doing
recognition: let the data suggest a frame; focus attention by activating
only a few frames at a time; rely on "demons" (if-added interrupts, for
example) to model "noticing"; let the frames guide the search; allow
default facts which are easily replaced. Hopefully they will support
whatever other functions frames are going to have.

Fahlman's recent paper, "A System for Representing and Using Real-World
Knowledge" <1975> repudiates much of this frame tradition, and advocates
the use of large parallel networks of "concept nodes" for storing
information and doing recognition of unfamiliar objects. Fahlman claims in
this paper that relevant slices of a large set of data cannot be
efficiently retrieved from a typical AI data structure; if "relevance" can
be modelled using cxts (or frames), I am confident this claim is false.
His other claim is that the usual frame-theoretic approaches to recognition
I mentioned in the preceding paragraph are not as powerful as doing
parallel intersections of large sets of nodes, each representing the
concepts with a given property, in order to suggest a familiar concept that
might share all of a set of properties. These intersections, which require
special-purpose parallel hardware, can be made to solve other well-known
problems. For example, disambiguation of co-occurrent terms like "pitcher"
and "diamond" can be accomplished by intersecting their sets of possible
contexts.
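
The intersection idea can be shown with a small serial Python sketch. It
is not Fahlman's mechanism, which relies on special-purpose parallel
hardware; the context sets below are invented for illustration and the
function name disambiguate is hypothetical.

```python
# Invented context sets, for illustration only.
POSSIBLE_CONTEXTS = {
    "pitcher": {"baseball", "kitchen"},
    "diamond": {"baseball", "jewelry", "cards"},
}


def disambiguate(terms, contexts=POSSIBLE_CONTEXTS):
    """Return the contexts shared by every term (empty set if none)."""
    sets = [contexts[term] for term in terms]
    if not sets:
        return set()
    return set.intersection(*sets)


shared = disambiguate(["pitcher", "diamond"])
```

Here the only context common to both terms is "baseball", which is the
reading the co-occurrence suggests.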

This proposal stands in direct opposition to the usual mechanisms
proposed by frame theorists to do these tasks. Each theory makes different
assumptions about where the complexity lies. The fundamental assumption of
frame theory is that a problem solver should meet a difficulty by finding
an organized body of knowledge that applies to it. For example, Marcus's
<1975> "wait-and-see parser" uses knowledge about choosing between parses
to avoid having to back up. It doesn't have problems requiring large
intersections for solution, because it always phrases its recognition
problems in terms which it knows are correct.

A language system knows that "Fido" is a noun, but it also knows under
what circumstances to do pattern-matching on "Fido" (i.e., in the
phonology), and when to call it a noun (in the syntax). It would be
possible to do Fahlmanesque "marker sweeps" to implement demon calls
triggered by structures the system builds, but, given these guidelines,
which are needed on independent linguistic grounds, there are cheaper ways.
Another example comes from Waltz's <1972> research on vision of scenes
with shadows. His method would seem to benefit greatly from parallel
marker sweeps, until closer examination reveals that enough simple
knowledge has been brought to bear on the problem to make the parallelism
unnecessary.

It seems at present that the evidence favors frame theory's account of
the complexities of intelligence, but the issues are too complex for a
verdict now. At any rate, data base efficiency does not seem to be the
issue.



Acknowledgments

This paper is the result of conversations with Scott Fahlman, Bob
Moore, and David Marr. The cxt indexing scheme of Sect. IV is a descendant
of Moore's solution to the symbol-mapping problem. Gerry Sussman and
others as referenced are responsible for most elements of the TDBM.
Sussman, Ronald Rivest, and others made helpful suggestions for improving
early drafts of this paper.